-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizer: Fix range extraction for CNF(conjunctive normal form) | tidb-test=pr/2341 #53908
Optimizer: Fix range extraction for CNF(conjunctive normal form) | tidb-test=pr/2341 #53908
Conversation
Hi @ghazalfamilyusa. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
b56c9b2
to
8b90abf
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #53908 +/- ##
=================================================
- Coverage 70.9866% 55.9975% -14.9892%
=================================================
Files 1507 1629 +122
Lines 413072 607643 +194571
=================================================
+ Hits 293226 340265 +47039
- Misses 100540 244184 +143644
- Partials 19306 23194 +3888
Flags with carried forward coverage won't be shown. Click here to find out more.
|
/ok-to-test |
8b90abf
to
570e2f1
Compare
20166dc
to
8525db7
Compare
/test unit-test |
@ghazalfamilyusa: The specified target(s) for
Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/test fast_test_tiprow |
@ghazalfamilyusa: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
8525db7
to
aa03831
Compare
/test fast_test_tiprow |
@ghazalfamilyusa: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc zanmato1984 elsa0520 |
/cc @XuHuaiyu for expression changes |
/cc @qw4990 |
What's different between with |
@elsa0520 ; this is what I talked about in the design document and in the summary above. The code will take the point ranges path when you have multiple conjuncts regardless if these conjuncts are relevant or not like (d > 3). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expression change LGTM
[LGTM Timeline notifier]Timeline:
|
But in the design we will deal with the CNF intersection. So after you support Computing intersection for CNF . We can merge the ((a = 1 and b = 2 and c > 3) or (a = 4 and b = 5 and c > 6)) and d>3 . Then we don't need to support PR alone, isn't it? |
@elsa0520 : seems my previous reply is not clear. The intersection is not 100% guaranteed at least for a while. Initially, we will have a flag and later on we may implement a heuristic solution. So, we need the point ranges in case the flag is off or if the intersection is not done for all the conjuncts. Now, with the reality of having point ranges (just in case) we still need to choose bestCNF over range. we can remove point ranges support and related PRs once we have: 100% intersection applied with no flag. Hope this makes sense and if not we can chat |
@@ -462,6 +463,18 @@ func (d *rangeDetacher) detachCNFCondAndBuildRangeForIndex(conditions []expressi | |||
// TODO: we will optimize it later. | |||
res.RemainedConds = AppendConditionsIfNotExist(res.RemainedConds, remainedConds) | |||
res.Ranges = ranges | |||
if bestCNFItemRes != nil && res != nil && len(res.Ranges) != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please comment this case " (heuristics applied for long lists or we turn off the intersection)" in front of here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elsa0520, XuHuaiyu, zanmato1984 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
48f41e0
to
bfad4b7
Compare
bfad4b7
to
9c7f0c4
Compare
What problem does this PR solve?
Issue Number: Ref #41598
Problem Summary:
Index range extraction in the optimizer can handle disjunctions like
((a = 1 and b = 2 and c > 3) or (a = 4 and b = 5 and c > 6)) and builds the proper index range like "(1 2 3,1 2 +inf], (4 5 6,4 5 +inf]" for this example. The problem is that when we add another conjunct like d > 3 then the optimizer takes another path which find only the point ranges "[1,1], [4,4]" in this example. Finding point ranges is OK for complex CNF where we only pick ranges from only one conjunct, But, in this case, the best conjunct is also better than point ranges.
What changed and how does it work?
We added code that picks the best conjunct ranges if it is more selective than the point ranges. For example, "(1 2 3,1 2 +inf], (4 5 6,4 5 +inf]" is more selective than "[1,1], [4,4]" . We implemented that by checking if the best CNF ranges is a subset of the point ranges.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.